Consciousness: Drinking from the Firehose of Experience

Author

  • Benjamin Kuipers
Abstract

The problem of consciousness has captured the imagination of philosophers, neuroscientists, and the general public, but has received little attention within AI. However, concepts from robotics and computer vision hold great promise to account for the major aspects of the phenomenon of consciousness, including philosophically problematical aspects such as the vividness of qualia, the first-person character of conscious experience, and the property of intentionality. This paper presents and evaluates such an account against eleven features of consciousness "that any philosophical-scientific theory should hope to explain", according to the philosopher and prominent AI critic John Searle.

(This work has taken place in the Intelligent Robotics Lab at the Artificial Intelligence Laboratory, The University of Texas at Austin. Research of the Intelligent Robotics Lab is supported in part by grants from the National Science Foundation (IIS-0413257), from the National Institutes of Health (EY016089), and by an IBM Faculty Research Award.)

The Problem of Consciousness

Artificial Intelligence is the use of computational concepts to model the phenomena of mind. Consciousness is one of the most central and conspicuous aspects of mind. In spite of this, AI researchers have mostly avoided the problem of consciousness in favor of modeling cognitive, linguistic, perceptual, and motor control aspects of mind. However, in response to a recent discussion of consciousness by the well-known philosopher and AI critic John Searle (Searle 2004), it seems to me that we are in a position to sketch out a plausible computational account of consciousness.

Consciousness is a phenomenon with many aspects. Searle argues that the difficult aspects of consciousness are those that make up the subjective nature of first-person experience. There is clearly a qualitative difference between thinking about the color red with my eyes closed in a dark room, and my own immediate experiences of seeing a red rose or an apple or a sunset. Philosophers use the term qualia (singular quale) for these immediate sensory experiences. Furthermore, when I see a rose or an apple, I see it as an object in the world, not as a patch of color on my retina. Philosophers refer to this as intentionality.

The position I argue here is that the subjective, first-person nature of consciousness can be explained in terms of the ongoing stream of sensorimotor experience (the "firehose of experience" of the title) and the symbolic pointers into that stream (which we call "trackers") that enable a computational process to cope with its volume. Consciousness also apparently constructs a plausible, coherent, sequential narrative for the activities of a large, unsynchronized collection of unconscious parallel processes in the mind. How this works, and how it is implemented in the brain, is a fascinating and difficult technical problem, but it does not seem to raise philosophical difficulties.

Other Approaches to Consciousness

There have been a number of recent books on the problem of consciousness, many of them from a neurobiological perspective. The more clinically oriented books (Sacks 1985; Damasio 1999) often appeal to pathological cases, where consciousness is incomplete or distorted in various ways, to illuminate the structure of the phenomenon of human consciousness through its natural breaking points.
Another approach, taken by Crick and Koch (Crick & Koch 2003; Koch 2003), examines in detail the brain pathways that contribute to visual attention and visual consciousness in humans and in macaque monkeys. Minsky (1985), Baars (1988), and Dennett (1991) propose architectures whereby consciousness emerges from the interactions among large numbers of simple modules.

John Searle is a distinguished critic of strong AI: the claim that a successful computational model of an intelligent mind would actually be an intelligent mind. His famous "Chinese room" example (Searle 1980) argues that even a behaviorally successful computational model would fail to have a mind. In some sense, it would just be "faking it." In Searle's recent book on the philosophy of mind (Searle 2004), he articulates a position he calls biological naturalism that describes the mind, and consciousness in particular, as "entirely caused by lower level neurobiological processes in the brain." Although Searle rejects the idea that the mind's relation to the brain is similar to a program's relation to a computer, he explicitly endorses the notion that the body is a biological machine, and therefore that machines (at least biological ones) can have minds, and can even be conscious. In spite of being nothing beyond physical processes, Searle holds that consciousness is not reducible to those physical processes because consciousness "has a first-person ontology" while the description of physical processes occurring in the brain "has a third-person ontology." He lays out eleven central features of consciousness "that any philosophical-scientific theory should hope to explain."

In the following three sections, I describe how a robotics researcher approaches sensorimotor interaction; propose a computational model of consciousness; and evaluate the prospects for using this model to explain Searle's eleven features of consciousness.

Sensorimotor Interaction in Robotics

When a robot interacts continually with its environment through its sensors and effectors, it is often productive to model that interaction as a continuous dynamical system, moving through a continuous state space toward an attractor. In the situations we will consider, such a dynamical system can be approximated by a discrete but fine-grained computational model, so by taking this view of the robot we are not moving outside the domain of computational modeling.

A Simple Robot in a Static World

Consider a simple robot agent in a static environment. In a static world, the only possible changes are to the state of the robot's body within the environment, which is represented by a time-varying state vector x(t). The derivative of x with respect to time is written $\dot{x}$. For a simple mobile robot moving on a planar surface, x would have the form $(x, y, \theta)$, representing the pose (position $(x, y)$ plus orientation $\theta$) of the robot within its environment. A robot with a more complex body would have a larger state vector x. We distinguish the environment and the robot's body from the computational process (which we will call the "agent") that receives the sense vector z(t) from the environment and determines a motor vector u(t) to send out to its body in the environment. Let m be the symbolic state of the agent's internal computational process.
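To make the notation concrete, here is a minimal sketch (ours, not the paper's; in Python, with sizes and field names chosen purely for illustration) of how x, u, z, and m might be represented for a simple two-wheeled robot carrying a range sensor and a small camera.

```python
import numpy as np

# Illustrative encodings of the quantities just defined (names and sizes are our own,
# chosen for concreteness; they are not taken from the paper).
x = np.array([0.0, 0.0, 0.0])        # state vector: pose (x, y, theta) of a simple planar robot

u = np.array([0.0, 0.0])             # motor vector: two wheel-velocity commands for a two-motor base

ranges = np.zeros(360)               # a ring of range readings, one per degree
image = np.zeros((48, 64, 3))        # a small RGB camera image
z = np.concatenate([ranges, image.ravel()])   # sense vector: all sensor data flattened into one long vector

m = {"active_control_law": None, "beliefs": []}  # symbolic state of the agent's internal process
```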
Note that the agent has access to its sense vector z, and can set its own motor vector u, but it has only indirect access to its own state vector x. The coupled system consisting of the robot agent and its environment can be described as a dynamical system. (Here we superimpose two useful standard representations: the block diagram and the differential equation. In the block-diagram view, the sense vector z flows from the World to the Agent, and the motor vector u flows from the Agent back to the World.)

World:
$$\dot{x} = F(x, u) \qquad (1)$$
$$z = G(x) \qquad (2)$$

Agent:
$$H_i := \mathrm{Select}(m, z) \qquad (3)$$
$$u = H_i(z) \qquad (4)$$

Equation (1) describes how the robot's state changes as a function of its current state and the motor vector. The function F represents the physics of the world and the robot's body, including all the complexities of motor performance, wheel friction, barriers to motion, and so on. F is not known to the agent (or to the human researcher, who typically uses simplified approximations). Equation (2) describes the dependence of the robot's sensor input on its current state. The function G is also extremely complex and not known to the agent. Equation (3) says that, from time to time, based on its internal symbolic state m and its current observations z, the agent selects a reactive control law $H_i$ which determines the current value u(t) of the motor vector as a function of the current value z(t) of the sensor input (equation 4).

For a particular choice of the control law $H_i$, equations (1, 2, 4) together define the coupled robot-environment system as a dynamical system, which specifies trajectories of $(x(t), \dot{x}(t))$ that the robot must follow. The robot's behavior alternates between (a) following a trajectory determined by a particular dynamical system until reaching a termination condition, and then (b) selecting a new control law $H_j$ that transforms the coupled robot-environment system into a different dynamical system, with different trajectories to follow (Kuipers 2000). (An illustrative discretized sketch of this alternating loop is given below, following the next subsection.)

The Firehose of Experience

When engineering a robot controller, a human designer typically works hard to keep the model tractable by reducing the dimensionality of the state, motor, and sensor vectors. However, as robots become more complex, or as we desire to apply this model to humans, these vectors become very high-dimensional. In a biological agent, the motor vector u includes control signals for hundreds or thousands of individual muscles. An artificial robot could have dozens to hundreds of motors (though a simple mobile robot will have just two).

The sensor stream z(t) is what I call "the firehose of experience" — the extremely high bandwidth stream of sensor data that the agent must cope with, continually. For a biological agent such as a human, the sense vector z contains millions of components representing the individual receptors in the two retinas, the cochlear cells in the two ears, and the many touch and pain receptors over the entire skin, not to mention taste, smell, balance, proprioception, and other senses. Robot senses are nowhere near as rich as human senses, but they still provide information at an overwhelming rate. (A stereo pair of color cameras alone generates data at over 440 megabits per second.)

With such a high data rate, any processing applied to the entire sensor stream must be simple, local, and parallel. In the human brain, arriving sensory information is stored in some form of short-term memory, remains available for a short time, and then is replaced by newly arriving information.
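Returning to equations (1)-(4): the sketch below (a minimal illustration; F, G, the control laws, and the selection rule are all stand-ins we invented, not the methods used in the Intelligent Robotics Lab) discretizes the coupled robot-environment system with a small time step and shows the alternation between following the trajectory induced by one control law and occasionally selecting another.

```python
import numpy as np

DT = 0.01  # time step; the paper treats the coupled system as continuous

def F(x, u):
    """World/body dynamics, equation (1): x_dot = F(x, u) for a unicycle-style robot (stand-in)."""
    v, omega = u                                  # forward and angular velocity commands
    return np.array([v * np.cos(x[2]), v * np.sin(x[2]), omega])

def G(x):
    """Sensor map, equation (2): z = G(x); here just a noisy view of the pose (stand-in)."""
    return x + np.random.normal(0.0, 0.01, size=3)

# A small library of reactive control laws H_i, each mapping z to u.
def go_forward(z): return np.array([0.5, 0.0])
def turn_left(z):  return np.array([0.1, 0.8])
CONTROL_LAWS = {"go_forward": go_forward, "turn_left": turn_left}

def select(m, z):
    """Equation (3): pick a control law from the symbolic state m and the observation z (toy rule)."""
    return "turn_left" if z[0] > 2.0 else "go_forward"

# The coupled loop: follow the trajectory induced by the current H_i, reselecting from time to time.
x = np.zeros(3)        # robot state; the agent never reads this directly
m = {}                 # agent's symbolic state
for step in range(1000):
    z = G(x)                           # equation (2): the agent's only view of the world
    if step % 100 == 0:                # "from time to time", per equation (3)
        H_i = CONTROL_LAWS[select(m, z)]
    u = H_i(z)                         # equation (4): reactive control
    x = x + DT * F(x, u)               # Euler step of equation (1)
```

The only structural points of the sketch are that the agent reads z and writes u, while x evolves inside the world model and is never accessed directly.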
Modifying Searle's Chinese Room metaphor (Searle 1980), in addition to comparatively infrequent slips of paper providing symbolic input and output, the room receives a huge torrent of sensory information that rushes in through one wall, flows rapidly through the room, and out the other wall, never to be recovered. Inside the room, John can examine the stream as it flows past, and can perhaps record fragments of the stream or make new symbolic notations based on his examination, in accordance with the rules specified in the room. The important point is that the "firehose of experience" provides information at a rate much greater than the available symbolic inference and storage mechanisms can handle. The best we can hope for is to provide pointers into the ongoing stream, so that relevant portions can be retrieved when needed.

Pointers into the Sensor Stream

The key concept for making sense of the "firehose of experience" is the tracker, a set of symbolic pointers into the sensor stream that maintains the correspondence between a higher-level, symbolically represented concept and its ever-changing image in the sensor stream. (Think of tracking a person walking across a scene while you are attending to something else in the scene.) We will describe trackers in the framework of equations (1-4) by equations of the form

$$m_k(t) = \tau_k(z(t)) \qquad (5)$$

meaning that an individual tracker $\tau_k$ takes as input the sensor stream z(t) and produces as output the symbolic description $m_k(t)$, which is part of the symbolic computational state m(t) of the agent. The subscript k indicates that multiple trackers $\tau_k$ may be active at any given time. An individual tracker $\tau_k$ may be created "top-down" by a symbolic inference process, or "bottom-up", triggered by detection of a localized feature in the sensory stream z(t). The human visual system includes parallel feature-detection mechanisms that trigger on certain "pop-out" colors and textures.

This is not a new idea. Versions of the sensorimotor tracker concept include Minsky's "vision frames" (1975), Marr and Nishihara's "spatial models" (1978), Ullman's "visual routines" (1984), Agre and Chapman's "indexical references" (1987), Pylyshyn's "FINSTs" (1989), Kahneman and Treisman's "object files" (1992), Ballard et al.'s "deictic codes" (1997), and Coradeschi and Saffiotti's "perceptual anchoring" (2003). The feedback-based technology for tracking objects from changing sensor input has its roots in radar signal interpretation from the 1940s (Wiener 1948; Gelb 1974). In the computer vision literature, methods for describing and tracking the outlines of extended objects have been developing since the 1980s (Blake & Yuille 1992; Hutchinson, Hager, & Corke 1996).

A tracker embodies one kind of active perception, dynamically maintaining pointers to the image of its target as it moves through the sensor stream. A tracker can be hierarchically structured, with subordinate trackers for parts of the object being tracked. Imagine the agent watching a person walk across the field of view. The tracker dynamically updates the outline of the person against the background, separating figure from ground. The tracker may include subordinate trackers for torso, head, arms, legs, and so on (Marr & Nishihara 1978). Quine (1961) describes human knowledge as a symbolic "web of belief" anchored at the periphery in sensorimotor experience. Trackers are the anchoring devices for symbols.
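As a rough illustration of equation (5), the sketch below (hypothetical; the class, its fields, and the template-matching search are ours, not the tracking machinery cited above) shows a tracker object whose update method consumes the current camera portion of z(t), maintains a pointer into it, and returns a small symbolic description m_k(t).

```python
import numpy as np

class Tracker:
    """Illustrative tracker (a sketch of equation (5), not the tracking methods cited above):
    it keeps a pointer (an image location) into the sensor stream and exposes a small
    symbolic description m_k(t) for the agent's reasoning system."""

    def __init__(self, concept, template, box):
        self.concept = concept        # symbolic constant this tracker anchors, e.g. "person-12"
        self.template = template      # expected appearance of the tracked concept (small image patch)
        self.box = box                # (row, col) pointer into the current image
        self.bound = True             # whether tracking is currently succeeding

    def update(self, image):
        """One step of equation (5): refresh the pointer from the current sense data z(t)."""
        h, w = self.template.shape[:2]
        r0, c0 = self.box
        best, best_err = (r0, c0), np.inf
        # Search a small window around the previous pointer location.
        for r in range(max(0, r0 - 5), min(image.shape[0] - h, r0 + 5) + 1):
            for c in range(max(0, c0 - 5), min(image.shape[1] - w, c0 + 5) + 1):
                patch = image[r:r + h, c:c + w].astype(float)
                err = float(np.sum((patch - self.template) ** 2))
                if err < best_err:
                    best_err, best = err, (r, c)
        self.box = best
        self.bound = np.isfinite(best_err)       # target lost if no candidate window matched at all
        # m_k(t): the symbolic description made available to symbolic reasoning.
        return {"concept": self.concept, "where": self.box, "bound": self.bound}
```

A tracker like this could be created top-down (constructed with a template for an expected concept) or bottom-up (constructed at the location of a pop-out feature), matching the two activation routes described above.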
We say that a tracker is bound to a spatio-temporal segment of the sensor stream when that portion of ongoing experience satisfies the criteria of the tracker's defining concept, and when tracking is successful in real time. The tracker mediates between signal processing and symbol manipulation. At the signal-processing end, the tracker implements a dynamical system that keeps its pointers corresponding as closely as possible with the relevant portion of the sensor stream. At the symbol-manipulation end, the tracker serves as a logical constant, with time-dependent predicates representing the attributes of the tracked object.

The location of the tracker within the sensor stream is regulated and updated by control laws, responding to the image properties expected for the tracked concept and their contrast with the background. Image-processing strategies such as dynamical "snakes" (Blake & Yuille 1992) represent changing boundaries between figure and ground. With adequate contrast, the high temporal granularity of the sensor stream means that updating the state of the tracker is not difficult. With increasing knowledge about the sensor image of the tracked concept, the tracker can maintain a good expectation about the relevant portion of the sensor stream even in the presence of occlusion, poor contrast, and other perceptual problems.

Trackers implement the principle that "the world is its own best model." When a tracker is bound to a portion of ongoing experience, current sensor data is easily available to whatever symbolic cognitive processes might be active, because the tracker provides efficient access into the correct portion of the sensor stream. The symbolic description of a high-level concept typically includes a number of attributes. If an instance of such a concept is bound to an active tracker, the values of such attributes can be extracted directly from the sensor stream. For example, in response to a query about the color of a tracked object, it is not necessary to retrieve a fact previously stored in memory. A sample can be extracted from the current sensor stream at a location pointed to by the tracker, and classified according to a symbolic taxonomy to produce a meaningful answer to the color question. There is evidence that human confidence in the completeness and quality of perception comes from this ability to retrieve high-quality data from the sensor stream on demand, rather than from complete processing of the image (Ballard, Hayhoe, & Pelz 1995; O'Regan & Noë 2001).

If a tracker loses its target, for example due to occlusion or poor image quality, it predicts the location of the target and attempts to reacquire it over some time interval. A hierarchical tracker might lose its more detailed components as the target moves to the periphery of the visual field, perhaps becoming little more than a "blob" tracker representing only location and extent. But if more detailed information is needed, a quick saccade can bring the target back to the fovea, the hierarchical structure of the tracker is rebound, and questions can be answered from vivid visual input as if it were continually available (Ballard et al. 1997). The phenomenon of "change blindness" (O'Regan & Noë 2001) illustrates some of the properties of trackers. People can fail to notice startlingly large changes in a visual scene due to intense focus of attention, or due to momentary breaks in the sensor stream that require the trackers to be rebound.
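The on-demand attribute retrieval described above can be sketched as follows (reusing the illustrative Tracker from the previous sketch; the color taxonomy and the sampling rule are hypothetical details added here, not the paper's mechanism).

```python
import numpy as np

# A toy symbolic color taxonomy (hypothetical; any classifier grounded in the sensor data would do).
COLOR_PROTOTYPES = {
    "red":   np.array([200.0,  40.0,  40.0]),
    "green": np.array([ 40.0, 180.0,  60.0]),
    "blue":  np.array([ 40.0,  60.0, 200.0]),
}

def query_color(tracker, image):
    """Answer "what color is the tracked object?" directly from the current sensor stream.

    No previously stored fact is consulted: the tracker's pointer says where to sample,
    and the sample is classified into the symbolic taxonomy on demand.
    """
    if not tracker.bound:
        return None                                  # target lost; the tracker would try to reacquire it
    r, c = tracker.box
    h, w = tracker.template.shape[:2]
    sample = image[r:r + h, c:c + w].reshape(-1, 3).mean(axis=0)   # mean RGB over the tracked region
    return min(COLOR_PROTOTYPES,
               key=lambda name: np.linalg.norm(sample - COLOR_PROTOTYPES[name]))
```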
A Computational Account of Consciousness

We extend our previous simple model of the robot agent to incorporate trackers. We also extend it to non-static worlds by distinguishing between the state x of the robot's body and the state w of the external world.

World:
$$[\dot{x}, \dot{w}] = F(x, w, u) \qquad (6)$$
$$z = G(x, w) \qquad (7)$$

Agent:
$$m_k(t) = \tau_k(z(t)) \qquad (8)$$
$$m := \mathrm{Update}(m, z) \qquad (9)$$
$$H_i := \mathrm{Select}(m, z) \qquad (10)$$
$$u = H_i(z, m) \qquad (11)$$

(As before, z flows from the World to the Agent, and u flows from the Agent back to the World.) The trackers (8) provide active, time-varying assertions directly grounded in the sensor stream. Equation (9) refers to the update of the agent's knowledge representation, based on its prior state and sensor input. (An illustrative agent-side sketch of equations (8)-(11) is given at the end of this section.)

According to the model proposed here, what it means to be conscious, what it means for an agent to have a subjective, first-person view of the world, is for the agent to have:

1. a high-volume sensor stream z(t) and a motor stream u(t) that are coupled, through the world and the robot's body, as described by equations (6-7);

2. methods for activating trackers both top-down, as directed by the symbolic reasoning system, and bottom-up, as triggered by features in the sensory input stream;

3. a collection of trackers $m_k(t) = \tau_k(z(t))$ (equation 8) capable of providing direct or indirect grounding in the sensor stream for the logical constants in the agent's symbolic knowledge representation system;

4. a reasoning system capable of selecting control laws $u(t) = H_i(z(t), m(t))$ (equations 10-11).

This is an unabashedly "Strong AI" claim. It does not appeal to the Turing Test (Turing 1950) to claim that such a robot behaves as if it were conscious. It says that, if we can examine the detailed nature of a robot agent's internal processing, and we can show that it corresponds to this model, then we know that it genuinely is conscious. Admittedly, the structural model doesn't help us know what it "feels like" to be this robot, any more than we can know what it feels like to be a bat, or a dolphin, or John Searle. But if we can verify that it has a "firehose of experience", and uses trackers to maintain the appropriate causal connections among its perceptions, its actions, and its stored knowledge, then we know that it is conscious.

The Update function in equation (9) encapsulates several large, important issues, including how attention is managed and how the coherent sequential narrative of subjective consciousness is constructed (see the discussion of Unity below). For its consciousness to be "human-like", a robot would have to be sufficiently similar to humans in sensorimotor system, symbolic processing capabilities, knowledge base, and cultural background. What "sufficiently" means here remains to be seen. Consider the difficulty in comparing human and dolphin consciousness, due to differences in sensorimotor system, environment, knowledge, and culture.
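To tie the pieces together, here is a hypothetical agent-side cycle for equations (8)-(11), reusing the illustrative Tracker sketch and treating z as the camera image it tracks in; the concrete Update rule, Select rule, and control laws are placeholders of our own, not the paper's.

```python
def agent_step(trackers, m, z):
    """One agent-side cycle of equations (8)-(11), using the illustrative pieces above.

    The structure follows the extended model; Update, Select, and the control laws are placeholders.
    """
    # Equation (8): each active tracker refreshes its grounded assertion m_k(t).
    grounded = {name: trk.update(z) for name, trk in trackers.items()}

    # Equation (9): Update -- revise the symbolic state from its prior value and the new evidence.
    # (The paper's Update also manages attention and constructs the coherent sequential narrative;
    # here it merely records the trackers' output.)
    m = dict(m, grounded=grounded)

    # Placeholder control laws; in the extended model they may read both z and m.
    def approach_target(z, m): return (0.3, 0.1)
    def search(z, m):          return (0.0, 0.5)

    # Equation (10): Select -- choose a reactive control law from m and z (toy rule).
    H_i = approach_target if any(g["bound"] for g in grounded.values()) else search

    # Equation (11): the chosen control law reads both the sense vector and the symbolic state.
    u = H_i(z, m)
    return m, u
```

The earlier discretized loop would call a function like agent_step in place of equations (3)-(4), so that the grounded assertions produced by the trackers sit between raw sensation and control.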
Evaluating a Theory of Consciousness

It is not yet possible to build a robot with sufficiently rich sensorimotor interaction with the physical environment, and a sufficiently rich capability for tracking and reasoning about its sensor and motor streams, to be comparable with human consciousness. The remaining barriers, however, appear to be technical rather than philosophical. The purpose of this essay is to argue that our current approach is plausible as an explanation of consciousness.

Searle (Searle 2004, pp. 134–145) provides eleven central features of consciousness "that any philosophical-scientific theory should hope to explain." We can begin evaluating this theory of consciousness by discussing how well such a computational model might account for these features. Each of the following subsections is titled with Searle's name for a feature, followed by a quote from his description of it.
